Deep Belief Nets for Topic Modeling
Workshop on Knowledge-Powered Deep Learning for Text Mining (KPDLTM-2014)
Authors
Abstract
Applying traditional collaborative filtering to digital publishing is challenging because user data is very sparse: the volume of documents is high relative to the number of users. Content-based approaches, on the other hand, are attractive because textual content is often very informative. In this paper we describe large-scale content-based collaborative filtering for digital publishing. To solve the digital publishing recommender problem we compare two approaches, latent Dirichlet allocation (LDA) and deep belief nets (DBN), both of which find low-dimensional latent representations for documents. Efficient retrieval can be carried out in the latent representation. We work both on public benchmarks and on digital media content provided by Issuu, an online publishing platform. This article also comes with a newly developed deep belief nets toolbox for topic modeling, tailored towards performance evaluation of the DBN model and comparisons to the LDA model.
Similar resources
Topic Modeling and Classification of Cyberspace Papers Using Text Mining
The global cyberspace networks provide individuals with platforms to interact, exchange ideas, share information, provide social support, conduct business, create artistic media, play games, engage in political discussions, and much more. The term cyberspace has become a conventional means to describe anything associated with the Internet and the diverse Internet culture. In fact, cyberspac...
Knowledge-Powered Deep Learning for Word Embedding
The basis of applying deep learning to solve natural language processing tasks is to obtain high-quality distributed representations of words, i.e., word embeddings, from large amounts of text data. However, text itself usually contains incomplete and ambiguous information, which makes it necessary to leverage extra knowledge to understand it. Fortunately, text itself already contains well-defined ...
Concept drift detection in business process logs using deep learning
Process mining provides a bridge between process modeling and analysis on the one hand and data mining on the other hand. Process mining aims at discovering, monitoring, and improving real processes by extracting knowledge from event logs. However, as most business processes change over time (e.g., the effects of new legislation, seasonal effects, etc.), traditional process mining techniques ...
Extensive Deep Belief Nets with Restricted Boltzmann Machine Using MapReduce Framework
Big data is a collection of data sets which is used to describe the exponential growth and availability of both ordered and amorphous data. It is difficult to process big data using traditional data processing applications. In many practical problems, deep learning is one of the machine learning algorithms that has gained great popularity in both academia and industry due to its high-level ab...
Semantic Deep Learning
Artificial intelligence and machine learning research is dedicated to building intelligent artifacts that can imitate or even transcend the cognitive abilities of human beings. To emulate human cognitive abilities with intelligent artifacts, one must first render machines capable of capturing critical aspects of sensory data, with adequate data representations and performing reasoning and infer...